Semantic Search: Document Ranking and Clustering Using Computer Science Ontology and N-Grams
نویسندگان
چکیده
Semantic similarity has become an important tool and widely been used to solve traditional Information Retrieval problems. This study adopts ontology of computer science and proposes an ontology indexing weight based on Wu and Palmer’s edge counting measure and uses the N-grams method for computing a family of word similarity. The study also compares the subsumption weight between Hliaoutakis and Nicola’s weight and query keywords (Decision Making, Genetic Algorithm, Machine Learning, Heuristic ). A probability value (p-values) from the t-test (p = 0.105) is higher 0.05, which indicates the evidence of no of no significant differences between the two weights methods. The experimental results show the new keyword-keyword similarity matrix scores that compute from hierarchical relationship weight based on Computer Science ontology and string matching (tri-grams) for computing of string of keyword. We computed the document-document similarity matrix scores using our keyword similarity matrix scores and compared them with the keyword matching weights using Dice coefficient method. In addition, this paper, we presented a new document semantic ranking process for the semantic ranking that proposes a new weight of query term in the document based on Computer Science Ontology weight. The experimental results show that the new document similarity score between a user’s query and the paper suggests that the new measures were effectively ranked.
منابع مشابه
Query Architecture Expansion in Web Using Fuzzy Multi Domain Ontology
Due to the increasing web, there are many challenges to establish a general framework for data mining and retrieving structured data from the Web. Creating an ontology is a step towards solving this problem. The ontology raises the main entity and the concept of any data in data mining. In this paper, we tried to propose a method for applying the "meaning" of the search system, But the problem ...
متن کاملCentralized Clustering Method To Increase Accuracy In Ontology Matching Systems
Ontology is the main infrastructure of the Semantic Web which provides facilities for integration, searching and sharing of information on the web. Development of ontologies as the basis of semantic web and their heterogeneities have led to the existence of ontology matching. By emerging large-scale ontologies in real domain, the ontology matching systems faced with some problem like memory con...
متن کاملA Joint Semantic Vector Representation Model for Text Clustering and Classification
Text clustering and classification are two main tasks of text mining. Feature selection plays the key role in the quality of the clustering and classification results. Although word-based features such as term frequency-inverse document frequency (TF-IDF) vectors have been widely used in different applications, their shortcoming in capturing semantic concepts of text motivated researches to use...
متن کاملRRLUFF: Ranking function based on Reinforcement Learning using User Feedback and Web Document Features
Principal aim of a search engine is to provide the sorted results according to user’s requirements. To achieve this aim, it employs ranking methods to rank the web documents based on their significance and relevance to user query. The novelty of this paper is to provide user feedback-based ranking algorithm using reinforcement learning. The proposed algorithm is called RRLUFF, in which the rank...
متن کاملAn Ensemble Click Model for Web Document Ranking
Annually, web search engine providers spend more and more money on documents ranking in search engines result pages (SERP). Click models provide advantageous information for ranking documents in SERPs through modeling interactions among users and search engines. Here, three modules are employed to create a hybrid click model; the first module is a PGM-based click model, the second module in a d...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- JDIM
دوره 12 شماره
صفحات -
تاریخ انتشار 2014